Compressive Reinforcement Learning with Oblique Random Projections
Authors
Abstract
Compressive sensing has rapidly grown into a non-adaptive dimensionality reduction framework in which high-dimensional data are projected onto a randomly generated subspace. In this paper we explore a paradigm called compressive reinforcement learning, where approximately optimal policies are computed in a low-dimensional subspace generated from a high-dimensional feature space through random projections. We use the framework of oblique projections, which unifies two popular methods for approximately solving MDPs, the fixed point (FP) and Bellman residual (BR) methods, and derive error bounds on the quality of the approximations obtained by combining random projections and oblique projections on a finite set of samples. We investigate the effectiveness of fixed point, Bellman residual, and hybrid least-squares methods in feature spaces generated by random projections. Finally, we present simulation results on various continuous MDPs, which show both gains in computation time and effectiveness in problems with large feature spaces and small sample sets.
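The non-adaptive projection step at the heart of this approach can be sketched in a few lines of NumPy. All sizes, the Gaussian projection matrix, and the random features below are illustrative assumptions, not the paper's exact construction:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: n sampled states, D original features, d projected features.
n, D, d = 200, 1000, 20
Phi = rng.standard_normal((n, D))          # high-dimensional feature matrix

# Non-adaptive Gaussian random projection with entries ~ N(0, 1/d),
# so squared norms are preserved in expectation (Johnson-Lindenstrauss).
A = rng.standard_normal((D, d)) / np.sqrt(d)
Phi_low = Phi @ A                           # n x d compressed features

# Sanity check: norms are approximately preserved on average.
ratio = np.linalg.norm(Phi_low, axis=1) / np.linalg.norm(Phi, axis=1)
print(Phi_low.shape, ratio.mean())          # mean ratio should be close to 1
```

Any value-function approximation scheme (FP, BR, or a hybrid) can then be run on `Phi_low` instead of `Phi`, at a fraction of the cost when `d << D`.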
Similar papers
Intelligent Control of a Sensor-Actuator System via Kernelized Least-Squares Policy Iteration
In this paper a new framework, called Compressive Kernelized Reinforcement Learning (CKRL), is proposed for computing near-optimal policies in sequential decision making under uncertainty. It combines non-adaptive, data-independent random projections with nonparametric Kernelized Least-Squares Policy Iteration (KLSPI). Random projections are a fast, non-adaptive dimensionality reduction f...
LSTD with Random Projections
We consider the problem of reinforcement learning in high-dimensional spaces when the number of features is larger than the number of samples. In particular, we study the least-squares temporal difference (LSTD) learning algorithm when a low-dimensional space is generated from a high-dimensional space with a random projection. We provide a thorough theoretical analysis of LSTD with random p...
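A minimal sketch of the compressed LSTD fixed-point solve, assuming Gaussian projections and synthetic transition data (sizes, discount factor, and seed are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
n, D, d, gamma = 300, 500, 15, 0.9       # samples, original dim, projected dim, discount

# Sampled transitions: features of states and next states, plus rewards (synthetic).
Phi      = rng.standard_normal((n, D))
Phi_next = rng.standard_normal((n, D))
r        = rng.standard_normal(n)

# Project both feature matrices with the SAME random matrix.
P = rng.standard_normal((D, d)) / np.sqrt(d)
X, X_next = Phi @ P, Phi_next @ P

# LSTD fixed-point solution in the compressed space:
#   w = (X^T (X - gamma * X_next))^{-1} X^T r
A = X.T @ (X - gamma * X_next)
b = X.T @ r
w = np.linalg.solve(A, b)
print(w.shape)                            # (15,)
```

Solving the d x d system costs O(d^3) rather than O(D^3), which is the source of the computational gains when d << D.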
Bellman Error Based Feature Generation using Random Projections on Sparse Spaces
This paper addresses the problem of automatic generation of features for value function approximation in reinforcement learning. Bellman Error Basis Functions (BEBFs) have been shown to improve policy evaluation, with a convergence rate similar to that of value iteration. We propose a simple, fast and robust algorithm based on random projections, which generates BEBFs for sparse feature spaces....
Machine Learning and Non-Negative Compressive Sampling
The new emerging theory of compressive sampling demonstrates that by exploiting the structure of a signal, it is possible to sample a signal below the Nyquist rate—using random projections—and achieve perfect reconstruction. In this paper, we consider a special case of compressive sampling where the uncompressed signal is non-negative, and propose a number of sparse recovery algorithms—which ut...
Efficient Machine Learning Using Random Projections
As an alternative to cumbersome nonlinear schemes for dimensionality reduction, the technique of random linear projection has recently emerged as a viable method for storage and rudimentary processing of high-dimensional data. We invoke new theory to motivate the following claim: the random projection method may be used in conjunction with standard algorithms for a multitude of machine lea...
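The claim rests on the Johnson-Lindenstrauss property that random linear maps approximately preserve Euclidean distances, so distance-based learners (k-NN, clustering) still work after projection. A quick numerical check with arbitrarily chosen dimensions:

```python
import numpy as np

rng = np.random.default_rng(2)
n, D, d = 100, 2000, 50                   # points, original dim, projected dim

X = rng.standard_normal((n, D))
R = rng.standard_normal((D, d)) / np.sqrt(d)
Y = X @ R                                 # project every point with the same map

# Distance between one pair of points before and after projection.
orig = np.linalg.norm(X[0] - X[1])
proj = np.linalg.norm(Y[0] - Y[1])
print(proj / orig)                        # ratio should be close to 1
```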